An integrated parser for TFG with explicit tree typing
Abstract
One of the main conditions for the development of successful NLP applications is the usability of the syntactic formalisms adopted and the degree to which they facilitate syntax-semantics integration. TAG+ formalisms show a real potential for NLP applications, due to their linguistic descriptive capabilities. However, both the standard formalisms and parsing strategies are often quite complex and cannot easily be used as such for the development of NLP systems. We have thus investigated a simplified TAG+ formalism, which sacrifices some of TAG's descriptive and formal properties for the sake of usability. This is especially relevant considering the recent success of empirical approaches to NLP, which tend to be based on very simple techniques and/or discard linguistically motivated formalisms [Basili et al., 1996] [Appelt et al., 1993].
We report the implementation of a parser for a simplified TAG+ formalism, Tree Furcating Grammars (TFG), which integrates semantic processing, performing both syntactic disambiguation and the construction of a semantic representation for the sentence parsed. The parser has been developed for the purpose of real-time speech understanding of sublanguages (i.e., application-dependent vocabularies of 500-1000 words with specific, sometimes quite simplified, syntactic constructs). TAG+ formalisms were initially investigated because of their potential for syntax-semantics integration (see e.g., Abeille [1994]). We will successively describe the rationale for the TFG formalism, the principles underlying the algorithm used and a first assessment of its performance.
Tree Furcating Grammars are a lexicalised TAG+ formalism in which adjunction is replaced by the furcation operation, which essentially adds an additional branch to the target node in the initial tree, instead of copying the auxiliary tree under it. The furcation operation was originally introduced in segment grammars [De Smedt & Kempen, 1990]. A detailed comparison of furcation and adjunction has been given by Abeille [1991]. Though some syntactic phenomena are not properly handled by furcation, the fact that it introduces modifiers without embedding them into the tree structure is a definite advantage for syntax-semantics integration, and was the rationale for choosing it. Successive furcations do not increase tree depth and complexity, producing derived trees that retain some properties of dependency trees. These can support the integrated construction of a semantic structure, based on the appropriate association of semantic functions to the tree structures (see below). We have adapted our tree representations accordingly, by distinguishing between left auxiliary trees (which have a *X root node) and right auxiliary trees (X* root …
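The furcation operation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' parser: the `Node` class, the `furcate`, `depth` and `words` helpers, and the toy sentence are all hypothetical names introduced here. The sketch also assumes that a left auxiliary tree (root `*X`) contributes its branches before the existing children of an X node and that a right auxiliary tree (root `X*`) contributes them after; the source only states the root-naming convention.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A tree node: a category or word label plus ordered children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def depth(node: Node) -> int:
    """Number of nodes on the longest root-to-leaf path."""
    return 1 + max((depth(c) for c in node.children), default=0)


def words(node: Node) -> List[str]:
    """Leaf labels read left to right."""
    if not node.children:
        return [node.label]
    return [w for c in node.children for w in words(c)]


def furcate(target: Node, aux: Node) -> None:
    """Add the branches of `aux` directly under `target`, in place.

    Assumed convention (not confirmed by the source): a "*X" root marks a
    left auxiliary tree whose branches go before the children of an X node,
    and an "X*" root marks a right auxiliary tree whose branches go after.
    """
    if aux.label == "*" + target.label:
        target.children[:0] = aux.children      # new branch on the left
    elif aux.label == target.label + "*":
        target.children.extend(aux.children)    # new branch on the right
    else:
        raise ValueError(f"{aux.label!r} cannot furcate at {target.label!r}")


def leaf(category: str, word: str) -> Node:
    """A preterminal node dominating a single word."""
    return Node(category, [Node(word)])


# Toy initial tree for "cats sleep".
vp = Node("VP", [leaf("V", "sleep")])
tree = Node("S", [Node("NP", [leaf("N", "cats")]), vp])
depth_before = depth(tree)

# Two successive furcations at the same VP node.
furcate(vp, Node("*VP", [leaf("Adv", "often")]))    # left auxiliary tree
furcate(vp, Node("VP*", [leaf("Adv", "soundly")]))  # right auxiliary tree

print(" ".join(words(tree)))
print(depth_before, depth(tree))
```

In this toy case both modifiers become sisters of the verb under the same VP node, so the tree gets wider but not deeper. Adjunction would instead insert a new VP level for each modifier.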
Similar resources
Studying impressive parameters on the performance of Persian probabilistic context free grammar parser
In linguistics, a tree bank is a parsed text corpus that annotates syntactic or semantic sentence structure. The exploitation of tree bank data has been important ever since the first large-scale tree bank, The Penn Treebank, was published. However, although originating in computational linguistics, the value of tree bank is becoming more widely appreciated in linguistics research as a whole. F...
Tree-grammar linear typing for unified super-tagging/probabilistic parsing models
We integrate super-tagging, guided-parsing and probabilistic parsing in the framework of an item-based LTAG chart parser. Items are based on a linear-typing of trees that encodes their expanding path, starting from their anchor.
Vanilla: An Open Language Framework
end end total = 0, add = fun(n : Int) total := total + n. Parser converts concrete syntax to the abstract representation. Small number of language-specific components. Large number of components provide abstractions common across several languages. Sub-typing is handled separately to allow different sub-type regimes to be explored. Interpreter uses type attributes to store type-derived information u...
A rule-based approach to the automatic conversion of dependency parse trees into phrase-structure parse trees for Persian
In this paper, an automatic method for converting a dependency parse tree into an equivalent phrase structure tree is introduced for the Persian language. In the first step, a rule-based algorithm was designed. Then, Persian-specific dependency-to-phrase-structure conversion rules were merged into the algorithm. Subsequently, the Persian dependency treebank with about 30,000 sentences was used as an input ...
Error Recovery in Parsing Relational Languages
The ability to report syntactic errors and to recover from them are basic requirements for any programming environment where programs are parsed before execution. Advanced error handling techniques are standard tools when processing textual programs, whereas in the context of visual languages the problem is factually unexplored. In this work, we develop an error recovery strategy for the parsin...